# Curvature Clues: Decoding Deep Learning Privacy with Input Loss Curvature
This is the official implementation of the paper "Curvature Clues: Decoding Deep Learning Privacy with Input Loss Curvature".

## Environment
This code is tested and validated with Python 3.9 and Python 3.11. To replicate our environment, use the provided `environment.yml` file by running:
```
conda env update -n py3.9_curv_clues --file environment.yml
```

## Code Flow
1. **Training Shadow Models** The first step is to train the shadow models; the code is in the `./train` directory.
2. **Compute Scores** Next, we compute the scores for various membership inference attack (MIA) methods; the code is in the `./precompute_scores` directory. The code saves the scores to Azure blob storage, so modifications may be needed to save them locally.
3. **Results** Finally, we fetch the precomputed scores to produce the results. The code for this step is in the notebooks (`*.ipynb` files) in the root directory.

## Training Shadow Models

### Setup
Create the folders `./pretrained/<dataset name>` and `./pretrained/<dataset name>/temp`, for example:
```
mkdir pretrained
mkdir pretrained/cifar100
mkdir pretrained/cifar100/temp
```
Then copy the training file in question to the root directory.
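
When several datasets are involved, the same folder layout can be created in one pass. The following is a minimal Python sketch of the `mkdir` steps above; the helper name `make_pretrained_dirs` is hypothetical and not part of this repository.

```python
from pathlib import Path


def make_pretrained_dirs(dataset_names, root="."):
    """Create <root>/pretrained/<name>/temp for each dataset name.

    Hypothetical helper: mirrors the manual `mkdir` commands and is safe
    to call again if the folders already exist.
    """
    created = []
    for name in dataset_names:
        temp_dir = Path(root) / "pretrained" / name / "temp"
        # parents=True also creates pretrained/ and pretrained/<name>/
        temp_dir.mkdir(parents=True, exist_ok=True)
        created.append(temp_dir)
    return created
```

For example, `make_pretrained_dirs(["cifar100"])` run from the root directory produces the same folders as the shell commands shown above.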

### Training CIFAR10 shadow models
To train CIFAR10 shadow models, set the `data_dir` variable in `./scripts/train_cifar10_shadow_models.sh` and run
```
sh train_cifar10_shadow_models.sh
``` 

### Training CIFAR100 shadow models
To train CIFAR100 shadow models, set the `data_dir` variable in `./scripts/train_cifar100_shadow_models.sh` and run
```
sh train_cifar100_shadow_models.sh
``` 

### ImageNet shadow models
For ImageNet, we use pre-trained models from [Feldman and Zhang](https://github.com/google-research/heldout-influence-estimation). We provide code to convert these models to PyTorch in the `./train/imagenet_shadow_models` directory. Set the path to ImageNet in `libdata/indexed_tfrecords.py`.

1. Please download `imagenet_index.npz` from [Feldman and Zhang](https://github.com/google-research/heldout-influence-estimation) and place it in `build_fz_imagenet/`
2. Use `build_imagenet.py` in the `build_fz_imagenet` directory to convert the dataset to TFRecord format.
3. Place the datasets in the following directory structure
4. Download the models
5. Copy the files from `./train/imagenet_shadow_models` to the root directory and run

```
python convert_imagenet_models_tf_2_torch.py --model_dir <path to where models were downloaded from Feldman and Zhang>
```

The `train` directory also contains code for training on CIFAR100 with differential privacy (`train/train_dp.py`), on random subsets (`train/train_cifar100_random_samples.py`), and on low-curvature subsets (`train/train_cifar100_low_curv_samples.py`).

## Compute Scores

### CIFAR100
To calculate the scores on CIFAR100, set the `data_dir` variable in `./scripts/precompute_cifar100_scores.sh`, copy the files to the root directory, and run
```
sh precompute_cifar100_scores.sh
```

### CIFAR10
To calculate the scores on CIFAR10, set the `data_dir` variable in `./scripts/precompute_cifar10_scores.sh`, copy the files to the root directory, and run
```
sh precompute_cifar10_scores.sh
```

### ImageNet
To calculate the scores on ImageNet, copy the files to the root directory and run
```
sh precompute_imagenet_scores.sh
```
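
As noted under Code Flow, the score scripts save to Azure blob storage, so saving locally requires modifying them. One possible shape for a local replacement is sketched below. It assumes the scores are a flat sequence of per-sample numbers; the function names, the `./scores` output folder, and the JSON format are illustrative choices, not the repository's actual storage code.

```python
import json
from pathlib import Path


def save_scores_locally(scores, method, dataset, out_dir="./scores"):
    """Write per-sample scores to <out_dir>/<dataset>/<method>.json.

    Hypothetical stand-in for the Azure upload step; `scores` is assumed
    to be a flat sequence of numbers (one score per sample).
    """
    path = Path(out_dir) / dataset / f"{method}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as f:
        # float() also accepts NumPy scalars, which json cannot serialize directly
        json.dump([float(s) for s in scores], f)
    return path


def load_scores_locally(method, dataset, out_dir="./scores"):
    """Read back the list of scores written by save_scores_locally."""
    path = Path(out_dir) / dataset / f"{method}.json"
    with path.open() as f:
        return json.load(f)
```

If the scripts are changed to write local files like this, the notebooks' fetch step would need a matching change to read from the same folder.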

## Results

The notebooks and what they reproduce are described below.

| File     | Description         |
|----------|---------------------|
| `conditonal_mia_aug_cifar10.ipynb` | Provides MIA results for various methods and reproduces the CIFAR10 results in Table 1 |
| `conditonal_mia_aug_cifar100.ipynb` | Provides MIA results for various methods and reproduces the CIFAR100 results in Table 1 |
| `conditonal_mia_aug_imagenet.ipynb` | Provides MIA results for various methods and reproduces the ImageNet results in Table 1 |
| `conditonal_mia_aug_v_m.ipynb` | Provides the results for the experiments in the "Effect of Dataset Size" section of the paper |
| `conditonal_mia_aug_dp.ipynb` | Provides the results for the experiments in the "Effect of Privacy" section of the paper |
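
The notebooks turn precomputed per-sample scores into attack results. To illustrate that kind of evaluation (this is not the notebooks' actual code), the sketch below computes AUROC and the true positive rate at a fixed false positive rate, two metrics commonly reported for MIAs, in plain Python. It assumes a higher score means "more likely a training member"; the function names and the toy scores are hypothetical.

```python
def auroc(member_scores, nonmember_scores):
    """Probability that a random member outscores a random non-member.

    Ties count as half a win. Quadratic in the number of samples, which
    is fine for a sketch but slow for full datasets.
    """
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in member_scores:
        for n in nonmember_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(nonmember_scores))


def tpr_at_fpr(member_scores, nonmember_scores, max_fpr):
    """Best TPR over all thresholds whose FPR does not exceed max_fpr.

    A sample is predicted "member" when its score is strictly greater
    than the threshold.
    """
    if not member_scores or not nonmember_scores:
        raise ValueError("both score lists must be non-empty")
    thresholds = [float("-inf")] + sorted(set(member_scores) | set(nonmember_scores))
    best_tpr = 0.0
    for t in thresholds:
        fpr = sum(s > t for s in nonmember_scores) / len(nonmember_scores)
        if fpr <= max_fpr:
            tpr = sum(s > t for s in member_scores) / len(member_scores)
            best_tpr = max(best_tpr, tpr)
    return best_tpr


if __name__ == "__main__":
    # Toy scores, for illustration only
    members = [0.9, 0.8, 0.4, 0.7]
    nonmembers = [0.1, 0.3, 0.5, 0.2]
    print("AUROC:", auroc(members, nonmembers))
    print("TPR at 0% FPR:", tpr_at_fpr(members, nonmembers, 0.0))
```

If a method's score runs the other way (lower means more likely a member), negate the scores before calling these functions.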
